In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location $\mu$, scale $\sigma$, and shape $\xi$. Sometimes it is specified by only scale and shape, and sometimes only by its shape parameter. Some references give the shape parameter as $\kappa = -\xi$.
With shape $\xi > 0$ and location $\mu = \sigma/\xi$, the GPD is equivalent to the Pareto distribution with scale $x_m = \sigma/\xi$ and shape $\alpha = 1/\xi$.
Definition
The cumulative distribution function of $X \sim GPD(\mu, \sigma, \xi)$ ($\mu \in \mathbb{R}$, $\sigma > 0$, and $\xi \in \mathbb{R}$) is
$$F_{(\mu,\sigma,\xi)}(x) = \begin{cases} 1 - \left(1 + \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - \exp\left(-\frac{x-\mu}{\sigma}\right) & \text{for } \xi = 0, \end{cases}$$
where the support of $X$ is $x \geqslant \mu$ when $\xi \geqslant 0$, and $\mu \leqslant x \leqslant \mu - \sigma/\xi$ when $\xi < 0$.
The probability density function (pdf) of $X \sim GPD(\mu, \sigma, \xi)$ is
$$f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi(x-\mu)}{\sigma}\right)^{\left(-\frac{1}{\xi}-1\right)},$$
again for $x \geqslant \mu$ when $\xi \geqslant 0$, and $\mu \leqslant x \leqslant \mu - \sigma/\xi$ when $\xi < 0$. For $\xi = 0$ the expression is understood as its limit, $\frac{1}{\sigma}\exp\left(-\frac{x-\mu}{\sigma}\right)$.
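The two formulas above translate directly into code. The following is a minimal sketch using only the Python standard library; the function names `gpd_cdf` and `gpd_pdf` are illustrative and do not come from any particular package, and setting the density to zero at the upper endpoint for $\xi < 0$ is a convention chosen here.

```python
import math


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi); 0 below the support, 1 above it."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0:
        # Only reachable for xi < 0: x is at or past mu - sigma/xi.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """Density of GPD(mu, sigma, xi); 0 outside the support."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    base = 1.0 + xi * z
    if base <= 0:
        # Convention used in this sketch: density 0 at/after the endpoint.
        return 0.0
    return base ** (-1.0 / xi - 1.0) / sigma


# Example: mu = 0, sigma = 1, xi = 0.5 at x = 2 gives 1 - 2**(-2).
print(gpd_cdf(2.0, mu=0.0, sigma=1.0, xi=0.5))  # 0.75
print(gpd_pdf(2.0, mu=0.0, sigma=1.0, xi=0.5))  # 0.125
```

The early returns encode the support stated above, so the functions can be evaluated safely at any real `x`.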
The pdf is a solution of the following differential equation:
$$\left\{\begin{array}{l} f'(x)(-\mu\xi + \sigma + \xi x) + (\xi + 1) f(x) = 0, \\ f(0) = \frac{\left(1 - \frac{\mu\xi}{\sigma}\right)^{-\frac{1}{\xi}-1}}{\sigma} \end{array}\right\}$$
The standard cumulative distribution function (cdf) of the GPD is defined using $z = \frac{x-\mu}{\sigma}$ (Embrechts et al. 1997, p. 162):
$$F_{\xi}(z) = \begin{cases} 1 - (1 + \xi z)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - e^{-z} & \text{for } \xi = 0, \end{cases}$$
where the support is $z \geq 0$ for $\xi \geq 0$ and $0 \leq z \leq -1/\xi$ for $\xi < 0$. The corresponding probability density function (pdf) is
$$f_{\xi}(z) = \begin{cases} (1 + \xi z)^{-\frac{\xi+1}{\xi}} & \text{for } \xi \neq 0, \\ e^{-z} & \text{for } \xi = 0. \end{cases}$$
Special cases
- If the shape $\xi$ and location $\mu$ are both zero, the GPD is equivalent to the exponential distribution.
- With shape $\xi = -1$ and location $\mu = 0$, the GPD is equivalent to the continuous uniform distribution $U(0, \sigma)$ (Castillo and Hadi 1997).
- With shape $\xi > 0$ and location $\mu = \sigma/\xi$, the GPD is equivalent to the Pareto distribution with scale $x_m = \sigma/\xi$ and shape $\alpha = 1/\xi$.
- If $X \sim GPD(\mu = 0, \sigma, \xi)$, then $Y = \log(X) \sim exGPD(\sigma, \xi)$ [1]. (exGPD stands for the exponentiated generalized Pareto distribution.)
- GPD is similar to the Burr distribution.
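The first three special cases can be checked numerically by comparing cumulative distribution functions on a grid. The sketch below is only an illustration: the helper names and the parameter values $\sigma = 2$, $\xi = 0.25$ are arbitrary choices, and the uniform case is evaluated at $\mu = 0$.

```python
import math


def gpd_cdf(x, mu, sigma, xi):
    """GPD cdf, clipped to 0 below and 1 above the support."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0:  # only for xi < 0, past the upper endpoint
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def uniform_cdf(x, a, b):
    return min(max((x - a) / (b - a), 0.0), 1.0)


def exponential_cdf(x, scale):
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def pareto_cdf(x, x_m, alpha):
    return 1.0 - (x_m / x) ** alpha if x >= x_m else 0.0


sigma, xi = 2.0, 0.25  # arbitrary illustrative values
grid = [0.1 * i for i in range(1, 200)]

# xi = 0, mu = 0: exponential with scale sigma
err_exponential = max(
    abs(gpd_cdf(x, 0.0, sigma, 0.0) - exponential_cdf(x, sigma)) for x in grid
)
# xi = -1, mu = 0: uniform on (0, sigma)
err_uniform = max(
    abs(gpd_cdf(x, 0.0, sigma, -1.0) - uniform_cdf(x, 0.0, sigma)) for x in grid
)
# xi > 0, mu = sigma/xi: Pareto with x_m = sigma/xi and alpha = 1/xi
err_pareto = max(
    abs(gpd_cdf(x, sigma / xi, sigma, xi) - pareto_cdf(x, sigma / xi, 1.0 / xi))
    for x in grid
)

print(err_exponential, err_uniform, err_pareto)
```

All three maximum differences should be at the level of floating-point rounding.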
Generating generalized Pareto random variables
Generating GPD random variables
If U is uniformly distributed on (0, 1], then
$$X = \mu + \frac{\sigma(U^{-\xi} - 1)}{\xi} \sim GPD(\mu, \sigma, \xi \neq 0)$$
and
$$X = \mu - \sigma\ln(U) \sim GPD(\mu, \sigma, \xi = 0).$$
Both formulas are obtained by inversion of the cdf.
The Pareto package in R and the gprnd command in the Matlab Statistics Toolbox can be used to generate generalized Pareto random numbers.
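As an illustration of the inversion formulas, here is a minimal standard-library sketch. The function name `gpd_sample` and the parameter values are assumptions made for this example. The check uses the median obtained by solving $F(m) = 1/2$ in the cdf above, namely $m = \mu + \sigma(2^{\xi} - 1)/\xi$ for $\xi \neq 0$.

```python
import math
import random


def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    # rng.random() lies in [0, 1), so u lies in (0, 1] as required.
    u = 1.0 - rng.random()
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


rng = random.Random(0)  # fixed seed so the run is reproducible
mu, sigma, xi = 1.0, 2.0, 0.25  # arbitrary illustrative values
draws = [gpd_sample(mu, sigma, xi, rng) for _ in range(100_000)]

# About half of the draws should fall below the theoretical median.
median = mu + sigma * (2.0 ** xi - 1.0) / xi
frac_below = sum(d <= median for d in draws) / len(draws)
print(frac_below)
```

Passing an explicit `random.Random` instance keeps the sampler reproducible without touching the global generator.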
GPD as an Exponential-Gamma Mixture
A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter. If
$$X \mid \Lambda \sim \operatorname{\mathsf{Exp}}(\Lambda)$$
and
$$\Lambda \sim \operatorname{\mathsf{Gamma}}(\alpha, \beta),$$
where $\alpha$ is the shape and $\beta$ the rate of the Gamma distribution, then
$$X \sim \operatorname{\mathsf{GPD}}(\xi = 1/\alpha,\ \sigma = \beta/\alpha).$$
Notice, however, that since the parameters of the Gamma distribution must be greater than zero, we obtain the additional restriction that $\xi$ must be positive.
In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for $Y \sim \operatorname{\mathsf{Exponential}}(1)$ and $Z \sim \operatorname{\mathsf{Gamma}}(1/\xi, 1)$, we have
$$\mu + \frac{\sigma Y}{\xi Z} \sim \operatorname{\mathsf{GPD}}(\mu, \sigma, \xi).$$
This is a consequence of the mixture after setting $\beta = \alpha$ and taking into account that the rate parameters of the exponential and gamma distributions are simply inverse multiplicative constants.
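The mixture representation can be checked by simulation. The sketch below assumes the rate parameterization of the Gamma distribution (so that $\sigma = \beta/\alpha$ holds) and uses the arbitrary values $\alpha = 4$, $\beta = 8$, which correspond to $\xi = 0.25$ and $\sigma = 2$. Note that Python's `random.gammavariate` takes a shape and a scale, so the rate $\beta$ is passed as the scale $1/\beta$.

```python
import random


def gpd_via_mixture(alpha, beta, rng):
    """Draw X | Lambda ~ Exp(rate=Lambda), Lambda ~ Gamma(shape=alpha, rate=beta)."""
    # gammavariate(shape, scale): a rate of beta means a scale of 1/beta.
    lam = rng.gammavariate(alpha, 1.0 / beta)
    return rng.expovariate(lam)


def gpd_cdf_zero_location(x, sigma, xi):
    """GPD cdf with mu = 0 and xi > 0, for x >= 0."""
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)


alpha, beta = 4.0, 8.0  # arbitrary illustrative values
xi, sigma = 1.0 / alpha, beta / alpha  # implied GPD parameters

rng = random.Random(0)
draws = [gpd_via_mixture(alpha, beta, rng) for _ in range(100_000)]

# Compare the empirical cdf with the GPD cdf at a few points.
mixture_errors = []
for x in (0.5, 2.0, 5.0):
    empirical = sum(d <= x for d in draws) / len(draws)
    mixture_errors.append(abs(empirical - gpd_cdf_zero_location(x, sigma, xi)))
print(mixture_errors)
```

The differences should be of the order of the Monte Carlo error, roughly a few thousandths for this sample size.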
Exponentiated generalized Pareto distribution
The exponentiated generalized Pareto distribution (exGPD)
If $X \sim GPD(\mu = 0, \sigma, \xi)$, then $Y = \log(X)$ is distributed according to the exponentiated generalized Pareto distribution, denoted by $Y \sim exGPD(\sigma, \xi)$.
The probability density function (pdf) of $Y \sim exGPD(\sigma, \xi)$ $(\sigma > 0)$ is
$$g_{(\sigma,\xi)}(y) = \begin{cases} \frac{e^{y}}{\sigma}\left(1 + \frac{\xi e^{y}}{\sigma}\right)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\ \frac{1}{\sigma} e^{y - e^{y}/\sigma} & \text{for } \xi = 0, \end{cases}$$
where the support is $-\infty < y < \infty$ for $\xi \geq 0$, and $-\infty < y \leq \log(-\sigma/\xi)$ for $\xi < 0$.
For all $\xi$, $\log\sigma$ acts as the location parameter. See the right panel for the pdf when the shape $\xi$ is positive.
The exGPD has finite moments of all orders for all $\sigma > 0$ and $-\infty < \xi < \infty$.
The moment-generating function of $Y \sim exGPD(\sigma, \xi)$ is
$$M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi}\left(-\frac{\sigma}{\xi}\right)^{s} B(s+1, -1/\xi) & \text{for } s \in (-1, \infty),\ \xi < 0, \\ \frac{1}{\xi}\left(\frac{\sigma}{\xi}\right)^{s} B(s+1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi),\ \xi > 0, \\ \sigma^{s}\,\Gamma(1+s) & \text{for } s \in (-1, \infty),\ \xi = 0, \end{cases}$$
where $B(a, b)$ and $\Gamma(a)$ denote the beta function and gamma function, respectively.
The expected value of $Y \sim exGPD(\sigma, \xi)$ depends on both the scale $\sigma$ and the shape $\xi$; the shape enters through the digamma function:
$$E[Y] = \begin{cases} \log\left(-\frac{\sigma}{\xi}\right) + \psi(1) - \psi(-1/\xi + 1) & \text{for } \xi < 0, \\ \log\left(\frac{\sigma}{\xi}\right) + \psi(1) - \psi(1/\xi) & \text{for } \xi > 0, \\ \log\sigma + \psi(1) & \text{for } \xi = 0. \end{cases}$$
Note that for a fixed value of $\xi \in (-\infty, \infty)$, $\log\sigma$ acts as the location parameter of the exponentiated generalized Pareto distribution.
The variance of $Y \sim exGPD(\sigma, \xi)$ depends only on the shape parameter $\xi$, through the polygamma function of order 1 (also called the trigamma function):
$$Var[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi + 1) & \text{for } \xi < 0, \\ \psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\ \psi'(1) & \text{for } \xi = 0. \end{cases}$$
See the right panel for the variance as a function of $\xi$. Note that $\psi'(1) = \pi^{2}/6 \approx 1.644934$.
Note that the roles of the scale parameter $\sigma$ and the shape parameter $\xi$ under $Y \sim exGPD(\sigma, \xi)$ are separately interpretable, which may lead to more robust and efficient estimation of $\xi$ than working with $X \sim GPD(\sigma, \xi)$ [2]. Under $X \sim GPD(\mu = 0, \sigma, \xi)$ the roles of the two parameters are entangled (at least up to the second central moment); see the formula for the variance $Var(X)$, in which both parameters appear.
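The mean and variance formulas for the exGPD can be sanity-checked by simulation at shape values where the digamma and trigamma functions have elementary closed forms. The sketch below relies on the standard identities $\psi(1) = -\gamma$ (with $\gamma$ the Euler–Mascheroni constant), $\psi(2) = 1 - \gamma$ and $\psi'(2) = \pi^{2}/6 - 1$, which give simple targets for $\xi \in \{-1, 0, 1/2\}$. The function names, the choice $\sigma = 3$, and the sample size are assumptions made for this illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # equals -psi(1)


def exgpd_sample(sigma, xi, rng):
    """Draw Y = log(X) with X ~ GPD(mu=0, sigma, xi) by cdf inversion."""
    u = rng.random()
    while u == 0.0:  # keep u strictly inside (0, 1)
        u = rng.random()
    if xi == 0:
        x = -sigma * math.log(u)
    else:
        # sigma * (u**(-xi) - 1) / xi, written with expm1 for accuracy near u = 1
        x = sigma * math.expm1(-xi * math.log(u)) / xi
    return math.log(x)


def exgpd_moments_exact(sigma, xi):
    """Closed-form (mean, variance) for the three shapes used in this check."""
    if xi == 0.0:
        return math.log(sigma) - EULER_GAMMA, math.pi ** 2 / 6
    if xi == 0.5:  # psi(1) - psi(2) = -1 and psi'(1) + psi'(2) = pi^2/3 - 1
        return math.log(2 * sigma) - 1.0, math.pi ** 2 / 3 - 1.0
    if xi == -1.0:  # psi(1) - psi(2) = -1 and psi'(1) - psi'(2) = 1
        return math.log(sigma) - 1.0, 1.0
    raise ValueError("only xi in {-1, 0, 0.5} is covered by this sketch")


def exgpd_moments_mc(sigma, xi, n=100_000, seed=0):
    """Monte Carlo estimate of (mean, variance) of Y."""
    rng = random.Random(seed)
    ys = [exgpd_sample(sigma, xi, rng) for _ in range(n)]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return mean, var


sigma = 3.0  # arbitrary illustrative scale
for xi in (-1.0, 0.0, 0.5):
    print(xi, exgpd_moments_exact(sigma, xi), exgpd_moments_mc(sigma, xi))
```

Changing `sigma` shifts the simulated mean by the change in $\log\sigma$ and leaves the variance unchanged, which is the location-parameter behaviour described above.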
The Hill's estimator
Assume that $X_{1:n} = (X_1, \cdots, X_n)$ are $n$ observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution $F$ whose tail distribution is regularly varying with tail index $1/\xi$ (hence, the corresponding shape parameter is $\xi$). To be specific, the tail distribution is described as
$$\bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \quad \text{for some } \xi > 0, \text{ where } L \text{ is a slowly varying function.}$$
It is of particular interest in extreme value theory to estimate the shape parameter $\xi$, especially when $\xi$ is positive (the so-called heavy-tailed case).
Let $F_u$ be the conditional excess distribution function of the observations over a threshold $u$. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions $F$, and large $u$, $F_u$ is well approximated by the generalized Pareto distribution (GPD). This motivated peaks-over-threshold (POT) methods for estimating $\xi$, in which the GPD plays the key role.
A well-known estimator using the POT methodology is Hill's estimator. Its technical formulation is as follows. For $1 \leq i \leq n$, write $X_{(i)}$ for the $i$-th largest value of $X_1, \cdots, X_n$. Then, with this notation, Hill's estimator (see page 190 of Reference 5 by Embrechts et al. [3]) based on the $k$ upper order statistics is defined as
$$\widehat{\xi}_{k}^{\text{Hill}} = \widehat{\xi}_{k}^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1}\sum_{j=1}^{k-1}\log\left(\frac{X_{(j)}}{X_{(k)}}\right), \quad \text{for } 2 \leq k \leq n.$$
In practice, the Hill estimator is used as follows. First, calculate the estimator $\widehat{\xi}_{k}^{\text{Hill}}$ at each integer $k \in \{2, \cdots, n\}$, and then plot the ordered pairs $\{(k, \widehat{\xi}_{k}^{\text{Hill}})\}_{k=2}^{n}$. Then, from the set of Hill estimators $\{\widehat{\xi}_{k}^{\text{Hill}}\}_{k=2}^{n}$, select those that are roughly constant with respect to $k$: these stable values are regarded as reasonable estimates of the shape parameter $\xi$. If $X_1, \cdots, X_n$ are i.i.d., then Hill's estimator is a consistent estimator of the shape parameter $\xi$ [4].
Note that the Hill estimator $\widehat{\xi}_{k}^{\text{Hill}}$ makes use of the log-transformation of the observations $X_{1:n} = (X_1, \cdots, X_n)$. (Pickands' estimator $\widehat{\xi}_{k}^{\text{Pickand}}$ also employs the log-transformation, but in a slightly different way [5].)
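The definition of the Hill estimator and the "compute for every $k$, then look for a stable region" procedure can be sketched as follows. This is a minimal standard-library illustration: the function names are hypothetical, the data are simulated from an exact Pareto distribution with $\xi = 0.5$ (an assumption made so the true value is known), and the plotting step is replaced by printing a few $(k, \widehat{\xi}_k)$ pairs.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of xi from the k largest observations (2 <= k <= n)."""
    n = len(data)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    ordered = sorted(data, reverse=True)  # ordered[0] is X_(1), the largest
    x_k = ordered[k - 1]                  # X_(k), the k-th largest
    if x_k <= 0:
        raise ValueError("the k largest observations must be positive")
    # Average of log(X_(j) / X_(k)) over j = 1, ..., k - 1.
    return sum(math.log(ordered[j] / x_k) for j in range(k - 1)) / (k - 1)


def hill_curve(data, ks):
    """Pairs (k, Hill estimate) for the requested values of k."""
    return [(k, hill_estimator(data, k)) for k in ks]


# Simulated heavy-tailed sample: X = U**(-xi) is Pareto with tail index 1/xi.
rng = random.Random(0)
true_xi = 0.5
sample = [(1.0 - rng.random()) ** (-true_xi) for _ in range(5000)]

for k, est in hill_curve(sample, [50, 100, 250, 500, 1000]):
    print(k, round(est, 3))
```

For data that are only approximately Pareto in the tail, small $k$ gives noisy estimates and large $k$ gives biased ones, which is why a stable range of $k$ is sought rather than a single fixed value.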
See also
- Burr distribution
- Pareto distribution
- Generalized extreme value distribution
- Exponentiated generalized Pareto distribution
- Pickands–Balkema–de Haan theorem
Further reading
- Pickands, James (1975). "Statistical inference using extreme order statistics". Annals of Statistics. 3: 119–131. doi:10.1214/aos/1176343003.
- Balkema, A.; De Haan, Laurens (1974). "Residual life time at great age". Annals of Probability. 2 (5): 792–804. doi:10.1214/aop/1176996548.
- Lee, Seyoon; Kim, J.H.K. (2018). "Exponentiated generalized Pareto distribution: Properties and applications towards extreme value theory". Communications in Statistics - Theory and Methods. 48 (8): 1–25. arXiv:1708.01686. doi:10.1080/03610926.2018.1441418. S2CID 88514574.
- N. L. Johnson; S. Kotz; N. Balakrishnan (1994). Continuous Univariate Distributions Volume 1, second edition. New York: Wiley. ISBN 978-0-471-58495-7. Chapter 20, Section 12: Generalized Pareto Distributions.
- Barry C. Arnold (2011). "Chapter 7: Pareto and Generalized Pareto Distributions". In Duangkamon Chotikapanich (ed.). Modeling Distributions and Lorenz Curves. New York: Springer. ISBN 9780387727967.
- Arnold, B. C.; Laguna, L. (1977). On generalized Pareto distributions with applications to income data. Ames, Iowa: Iowa State University, Department of Economics.
References
Coles, Stuart (2001-12-12). An Introduction to Statistical Modeling of Extreme Values. Springer. p. 75. ISBN 9781852334598.
Dargahi-Noubary, G. R. (1989). "On tail estimation: An improved method". Mathematical Geology. 21 (8): 829–842. Bibcode:1989MatGe..21..829D. doi:10.1007/BF00894450. S2CID 122710961.
Hosking, J. R. M.; Wallis, J. R. (1987). "Parameter and Quantile Estimation for the Generalized Pareto Distribution". Technometrics. 29 (3): 339–349. doi:10.2307/1269343. JSTOR 1269343.
Davison, A. C. (1984-09-30). "Modelling Excesses over High Thresholds, with an Application". In de Oliveira, J. Tiago (ed.). Statistical Extremes and Applications. Kluwer. p. 462. ISBN 9789027718044.
Embrechts, Paul; Klüppelberg, Claudia; Mikosch, Thomas (1997-01-01). Modelling extremal events for insurance and finance. Springer. p. 162. ISBN 9783540609315.
Castillo, Enrique; Hadi, Ali S. (1997). "Fitting the generalized Pareto distribution to data". Journal of the American Statistical Association. 92 (440): 1609–1620.